
feat: BYOE 0.8.0 - endpoints registry + openai-compat provider (REQ-142)#92

Merged
tbitcs merged 3 commits into develop from feat/byoe-endpoints-store
May 4, 2026

Conversation


@tbitcs tbitcs commented May 1, 2026

What

Ships Bring-Your-Own-Endpoint (BYOE) support for OpenAI-v1-compatible LLM backends (vLLM, llama.cpp server, LM Studio, TGI, ...). Closes the user request to route Specsmith chat / serve through a self-hosted vLLM on the LAN.

Phases

  • PR-1 commit (323fd30): endpoints store + CLI group.
  • src/specsmith/agent/endpoints.py: Endpoint / EndpointAuth / EndpointStore / EndpointHealth dataclasses; schema_version=1; JSON persistence at ~/.specsmith/endpoints.json with chmod 600; token resolution dispatch (none / bearer-inline / bearer-env / bearer-keyring); /v1/models health probe with TLS verify toggle.
    • specsmith endpoints group with add / list / remove / default / test / models subcommands. Inline-token redaction on --json, hidden-input prompt for keyring path, --purge-keyring on remove.
    • 38 new tests, a docs/site/endpoints.md walkthrough, and an api_surface.json entry registering the endpoints command group.
  • PR-2 commit (9ecd39e): provider driver + --endpoint flag.
    • _run_openai_compat in chat_runner.py streams from the registered endpoint via raw stdlib HTTP / SSE (no openai SDK dependency). run_chat takes an optional endpoint_id; when set, the BYOE store is consulted and the resolved endpoint short-circuits the auto-detect provider chain. Failure modes (unreachable, 401, missing default model) fall back gracefully.
    • --endpoint <id> flag on specsmith chat and serve. serve resolves the endpoint at startup, derives provider+model, and exports SPECSMITH_ACTIVE_ENDPOINT.
    • 4 new e2e tests against an in-process fake /v1/chat/completions SSE server.
  • Release commit (f155fa4): pyproject.toml → 0.8.0.

Validation

  • ruff check + ruff format --check clean for the new and modified files.
  • mypy clean for src/specsmith/agent/endpoints.py (the strict-mode tier).
  • pytest tests/test_endpoints_store.py tests/test_endpoints_cli.py tests/test_chat_runner_openai_compat.py tests/test_warp_parity_followup.py tests/test_warp_parity.py: 82 passing.

How to test on your workstation

  1. Pull / install specsmith 0.8.0 (or run the dev branch in editable mode).
  2. specsmith endpoints add --id home-vllm --name "Home vLLM" --base-url http://10.0.0.4:8000/v1 --default-model qwen2.5-coder --auth none --set-default.
  3. specsmith endpoints test home-vllm.
  4. specsmith chat --endpoint home-vllm "hello" — the response now streams from your vLLM, not Ollama / Anthropic / OpenAI.
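For context on what "streams from your vLLM" means at the wire level, here is an illustrative parser for an OpenAI-v1 chat/completions SSE stream, the kind of stdlib-only parsing a driver like _run_openai_compat would do. Field names follow the public OpenAI wire format; the actual implementation may differ:

```python
import json

def collect_deltas(sse_lines):
    """Yield content fragments from 'data:' lines until the [DONE] sentinel."""
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]
```

Each `data:` event carries one JSON chunk whose `choices[0].delta.content` is the next token batch; concatenating the yielded fragments reconstructs the full reply.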

Out of scope (PR-3 in the extension repo)

Co-Authored-By: Oz <oz-agent@warp.dev>

tbitcs and others added 3 commits May 1, 2026 07:42
Phase 1 of the Bring-Your-Own-Endpoint sprint. Adds a generic OpenAI-v1-compatible endpoint registry so users can register self-hosted vLLM, llama.cpp server, LM Studio, and TGI backends and pick between them.

- src/specsmith/agent/endpoints.py: Endpoint / EndpointAuth / EndpointStore / EndpointHealth dataclasses, schema_version=1, JSON persistence at ~/.specsmith/endpoints.json (chmod 600), token resolution dispatch (none / bearer-inline / bearer-env / bearer-keyring), /v1/models health probe with TLS verify toggle.

- src/specsmith/cli.py: 'specsmith endpoints' group with add / list / remove / default / test / models subcommands. Inline-token redaction in --json output, optional bearer-keyring storage with hidden-input prompt, --purge-keyring on remove, --set-default on add.

- tests/test_endpoints_store.py + tests/test_endpoints_cli.py: 38 new tests covering validation, round-trip, redaction, token resolution dispatch, and /v1/models health against an in-process fake server.

- tests/fixtures/api_surface.json: registered 'endpoints' as a top-level command for REQ-140 stability.

- docs/site/endpoints.md: BYOE walkthrough, auth strategy table, security notes, CLI reference.
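A minimal sketch of the persistence behaviour described above (schema_version=1 JSON written with 0600 permissions). Function and field names are assumptions for illustration; the real EndpointStore lives in src/specsmith/agent/endpoints.py:

```python
import json
import os
from pathlib import Path

def save_endpoints(path: Path, endpoints: list[dict]) -> None:
    """Persist the registry as versioned JSON, readable only by the owner."""
    payload = {"schema_version": 1, "endpoints": endpoints}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, indent=2))
    os.chmod(path, 0o600)  # inline tokens must not be world-readable

def load_endpoints(path: Path) -> list[dict]:
    """Load the registry, refusing files written by an unknown schema."""
    data = json.loads(path.read_text())
    if data.get("schema_version") != 1:
        raise ValueError(f"unsupported schema_version: {data.get('schema_version')}")
    return data["endpoints"]
```

Carrying schema_version in the file is what lets a future release migrate old registries instead of guessing at their shape.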

Validation: ruff lint clean, ruff format clean, mypy strict clean for the new module, pytest 66/66 passing across the new suites + the existing api-surface stability test.

Co-Authored-By: Oz <oz-agent@warp.dev>
Phase 2 of the Bring-Your-Own-Endpoint sprint. Wires the registry from PR-1 into the chat surface and the persistent serve loop.

- src/specsmith/agent/chat_runner.py: new _run_openai_compat driver streams from a registered Endpoint via raw stdlib HTTP / SSE (no openai SDK dependency). run_chat() takes an optional endpoint_id; when set, the BYOE store is consulted and the resolved endpoint short-circuits the auto-detect provider chain. Failure modes (unreachable, 401, missing default model) fall back gracefully.

- src/specsmith/cli.py: 'specsmith chat --endpoint <id>' threads through to run_chat. 'specsmith serve --endpoint <id>' resolves the endpoint at startup, derives provider+model, and exports SPECSMITH_ACTIVE_ENDPOINT for downstream consumers.

- tests/test_chat_runner_openai_compat.py: 4 new pytest cases against an in-process fake /v1/chat/completions SSE server. Covers happy-path streaming, missing default-model fallback, 401-on-bad-token fallback, and the run_chat entry point with endpoint_id resolution.
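In the spirit of that fixture, here is a self-contained sketch of an in-process fake /v1/chat/completions SSE server plus a stdlib client that streams from it. All names here are illustrative; the real test fixture may be structured differently:

```python
import http.server
import json
import threading
import urllib.request

class FakeSSEHandler(http.server.BaseHTTPRequestHandler):
    """Serve a canned two-chunk SSE reply to any POST, then [DONE]."""

    def do_POST(self):
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.end_headers()
        for piece in ("Hello", " world"):
            chunk = {"choices": [{"delta": {"content": piece}}]}
            self.wfile.write(f"data: {json.dumps(chunk)}\n\n".encode())
        self.wfile.write(b"data: [DONE]\n\n")

    def log_message(self, *args):  # keep test output quiet
        pass

def stream_chat(base_url: str) -> str:
    """POST a chat request and concatenate the streamed content deltas."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps({"messages": [{"role": "user", "content": "hi"}]}).encode(),
        headers={"Content-Type": "application/json"},
    )
    out = []
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            line = raw.decode().strip()
            if not line.startswith("data:"):
                continue
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                break
            delta = json.loads(payload)["choices"][0].get("delta", {})
            out.append(delta.get("content", ""))
    return "".join(out)

# Bind to an ephemeral port so the test never collides with a real service.
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), FakeSSEHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
reply = stream_chat(f"http://127.0.0.1:{server.server_port}")
server.shutdown()
```

Running everything in-process keeps the e2e tests hermetic: no network, no external model server, and the happy-path, 401, and missing-model cases are just different handler behaviours.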

Validation: ruff lint + format clean, 82/82 passing across the new + existing endpoint and warp parity suites.

Co-Authored-By: Oz <oz-agent@warp.dev>
Bump pyproject.toml to 0.8.0 to ship the Bring-Your-Own-Endpoint feature (REQ-142): the new endpoints store + 'specsmith endpoints' CLI group (PR-1) and the openai-compat provider driver wired through 'specsmith chat / serve --endpoint <id>' (PR-2).

Co-Authored-By: Oz <oz-agent@warp.dev>
@tbitcs tbitcs merged commit d82a358 into develop May 4, 2026
12 checks passed
@tbitcs tbitcs deleted the feat/byoe-endpoints-store branch May 4, 2026 20:01
